Abstract



Standard compressive sensing results state that to exactly recover an $s$-sparse signal in $\mathbb{R}^p$, one requires
$O(s \log p)$ measurements. While this bound is extremely useful in practice, often real world signals are not
only sparse, but also exhibit structure in the sparsity pattern. We focus on group-structured patterns 
in this paper. Under this model, groups of signal coefficients are active (or inactive) together. The 
groups are predefined, but the particular set of groups that are active (i.e., in the signal support) must 
be learned from measurements. We show that exploiting knowledge of groups can further reduce the 
number of measurements required for exact signal recovery, and derive universal bounds for the number 
of measurements needed. The bound is universal in the sense that it only depends on the number of 
groups under consideration, and not the particulars of the groups (e.g., compositions, sizes, extents, 
overlaps, etc.). Experiments show that our result holds for a variety of overlapping group configurations. 

1 Introduction 

In many fields such as genetics, image processing, and machine learning, one is faced with the task of
recovering very high dimensional signals from relatively few measurements. In general this is not possible,
but fortunately many real world signals are, or can be transformed to be, sparse, meaning that only a
small fraction of the signal coefficients are non-zero. Compressed sensing [3] allows us to recover sparse, high
dimensional signals from very few measurements. In fact, results indicate that one only needs $O(s \log p)$
random measurements to exactly recover an $s$-sparse signal of length $p$.

In many applications however, one not only has knowledge about the sparsity of the signal, but some 
additional information about the structure of the sparsity pattern as well: 

• In genetics, the genes are arranged into pathways, and genes belonging to the same pathway are highly 
correlated with each other [22] . 

• In image processing, the wavelet transform coefficients can be modeled as belonging to a tree, with 
parent-child coefficients exhibiting similar properties [51 [3TJ [05] . 

• In wideband spectrum sensing applications, the spectrum typically displays clusters of non-zero
frequency coefficients, each corresponding to a narrowband transmission [15].



In cases such as these, the sparsity pattern can be represented as a union of certain groups of coefficients 
(e.g., coefficients in certain pathways, tree branches, or clusters). This knowledge about the signal structure 
can help further reduce the number of measurements one needs to exactly recover the signal. Indeed, the
authors in [10] derive information theoretic bounds for the number of measurements needed for a variety of
signal ensembles, including trees. In [7], the authors show that one needs far fewer measurements when
the signal can be expressed as lying in a union of subspaces, and explicit bounds are derived when using a
modified version of CoSaMP [17] to recover the signal. In this paper, we derive bounds on the number of
random i.i.d. Gaussian measurements needed to exactly recover a sparse signal when its pattern of sparsity lies
in a union of groups, when solving the convex recovery algorithm introduced in 

We analyze the group-structured sparse recovery problem using a random Gaussian measurement model. We 
emphasize that although the derivation assumes the measurement matrix to be Gaussian, it can be extended 
to any subgaussian case, by paying a small constant penalty, as shown in [14]. We restrict ourselves to the 
Gaussian case here since it highlights the main ideas and keeps the analysis as simple as possible. 

Note that in this work, variables can be grouped into arbitrary sets, and we make no assumptions about the 
nature of the groups, except that they are known in advance. In short, we derive bounds for any generic 
group structure of variables, whether the groups overlap or form a partition of the ambient high dimensional 
space. 

To the best of our knowledge, these results are new and distinct from prior theoretical characterizations of 
group lasso methods. Asymptotic consistency results are derived for the group lasso when the groups partition 
the space of variables in [T]. Similarly, in [S], the authors consider the groups to partition the space, and 
derive conditions for recovery using the group lasso [25]. In [12], [13], the authors derive consistency results
for the group lasso under arbitrary groupings of variables. In [18], the authors consider overlapping groups
and derive sample bounds under the group lasso [25] setting. The authors in [11] derive consistency results
in an asymptotic setting for the group lasso with overlap, but do not provide exact recovery results. The
general group lasso scenario differs from what we consider: the group lasso yields vectors whose
support can be expressed as the complement of a union of groups, while we require
the support to lie in a union of groups, a distinction made in [11]. Note that in the case of non-overlapping
groups, the complement of a union of groups is a union of (a different set of) groups. In this paper, we (a)
derive sample complexity bounds in a compressive-sensing framework when the measurement matrix is i.i.d.
Gaussian, (b) focus on non-asymptotic sample bounds in the case where the support is contained in a
union of groups, and (c) make no assumptions about the nature of the groups. To derive our results, we appeal
to the notion of restricted minimum singular values of an operator.

We bound the number of measurements needed for exact recovery with two terms. The first term grows linearly
in the total number of non-zero coefficients (with a small constant of proportionality). This is close to the 
bare minimum of one measurement per non-zero component. The second term only depends on the number 
of groups under consideration, and not the particulars of the groups (e.g., compositions, sizes, extents, etc.). 
In particular, the groups need not be disjoint. The degree to which groups overlap, remarkably, has no effect 
on our bounds. In this regard, our bounds can be termed to be universal. This is somewhat surprising since 
overlapping groups are strongly coupled in the observations, tempting one to suppose that overlap may make 
recovery more challenging. 

Our main result shows that for signals with support on $k$ of $M$ possible groups, exact recovery is possible
from $(\sqrt{2\log(M-k)} + \sqrt{B})^2 k + kB$ measurements using an overlapping group lasso algorithm, $B$ being the
maximum group size. Note that the bound depends on the sparsity $s$ of the signal via the $kB$ term, which is
a loose upper bound for $s$ when the groups overlap heavily. This arises as an artifact of the general approach
we use to bound the number of measurements, and in specific cases, this can be made much tighter.

Our proof derives from the techniques developed in [4]. The rest of this paper is organized as follows: in
Section 2 we lay the groundwork for the main contribution of the paper, viz. applying the techniques from
[3] to the specific setting of the group lasso with overlapping groups. We describe the theory and reasoning
behind this approach. In Section 3 we derive bounds on the number of random i.i.d. Gaussian measurements
needed for exact recovery of group-sparse signals. We further derive bounds for the number
of measurements required for robust recovery of signals. Section 4 outlines the experiments we
performed and the corresponding results obtained. We conclude our paper in Section 5.

1.1 Notations



We first introduce notation that we will use for the rest of the paper. Consider a signal of length $p$ that is
$s$-sparse. In the case of multidimensional signals such as images, we assume they are vectorized to have
length $p$. The coefficients of the signal are grouped into sets $\{G_i\}_{i=1}^{M}$ such that $\forall i \in \{1, 2, \ldots, M\}$,
$G_i \subset \{1, 2, \ldots, p\}$. We denote the set of groups by $\mathcal{G} = \{G_i\}_{i=1}^{M}$, and $|\cdot|$ denotes the cardinality of a set. We
let $x^*$ be the (sparse) signal to be recovered, whose non-zero coefficients lie in $k$ of the $M$ groups, $\mathcal{G}^* \subset \mathcal{G}$.
Formally,

$$\mathcal{G}^* = \{G \in \mathcal{G} : \mathrm{supp}(x^*) \cap G \neq \emptyset\}$$

We assume $|\mathcal{G}^*| = k < M = |\mathcal{G}|$. We let $\Phi \in \mathbb{R}^{n \times p}$ be a measurement matrix consisting of i.i.d. Gaussian entries
of mean 0 and unit variance, so that every column is a realization of an i.i.d. Gaussian length-$n$ vector with
covariance $I$. For any vector $x \in \mathbb{R}^p$, we denote by $x_G$ the vector in $\mathbb{R}^p$ such that $(x_G)_i = x_i$ if $i \in G$, and
$0$ otherwise. We denote the observed vector by $y \in \mathbb{R}^n$: $y = \Phi x^*$. The absence of a subscript following a
norm $\|\cdot\|$ implies the $\ell_2$ norm. The dual norm of $\|\cdot\|_p$ is denoted by $\|\cdot\|_p^*$. The convex hull of a set of
points $S$ is denoted by $\mathrm{conv}(S)$.

2 Preliminaries 

In this Section, we will set up the problem that we wish to solve in this paper. We will argue as to why exact 
recovery of the signal corresponds to the minimization of the atomic norm of the signal, with the atoms 
obeying certain properties governed by the signal structure. 

2.1 Atoms and the atomic set 

To begin with, let us formalize the notion of atoms and the atomic norm of a signal (or vector). We will
restrict our attention to group-sparse signals in $\mathbb{R}^p$, though the same concepts can be extended to other
spaces as well. We assume that $x \in \mathbb{R}^p$ can be decomposed as:

$$x = \sum_{i=1}^{k} c_i a_i, \qquad c_i \ge 0$$

The vectors $a_i$ are called atoms, and form the basic building blocks of any signal, which can be represented
as a conic combination of the atoms. We denote by $\mathcal{A} = \{a_i\}$ the atomic set. Given a vector $x \in \mathbb{R}^p$ and
an atomic set, we define the atomic norm as

$$\|x\|_{\mathcal{A}} = \inf\left\{ \sum_{a \in \mathcal{A}} c_a : x = \sum_{a \in \mathcal{A}} c_a a, \; c_a \ge 0 \;\; \forall a \in \mathcal{A} \right\} \tag{1}$$

The atomic decomposition of the signal has been known to be the simplest representation of the signal in
some sense. Hence, to obtain a "simple" representation of a vector, we look to minimize the atomic norm
subject to constraints (equation (2)):

$$\hat{x} = \arg\min_{x \in \mathbb{R}^p} \|x\|_{\mathcal{A}} \quad \text{s.t.} \quad y = \Phi x \tag{2}$$

Indeed, when the atoms are merely the canonical basis in $\mathbb{R}^p$, the atomic norm reduces to the standard $\ell_1$
norm, and minimization of the atomic norm yields the well known lasso procedure [23].

Assuming we are aware of the group structure $\mathcal{G}$, we now proceed to define the atomic set and the
corresponding atomic norm for our framework:

$\forall G \in \mathcal{G}$, let

$$A_G = \{a_G \in \mathbb{R}^p : \|(a_G)_G\|_2 = 1, \; (a_G)_{G^c} = 0\}$$
$$\mathcal{A} = \{A_G\}_{G \in \mathcal{G}} \tag{3}$$

We now show that the atomic norm of a vector $x \in \mathbb{R}^p$ under the atomic set defined in equation (3) is
equivalent to the overlapping group lasso norm, a special case of which is the standard group
lasso norm [25]. Thus, minimizing the atomic norm in this case is exactly the same as the group lasso with
overlapping groups.

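Lemma 2.1 Given any arbitrary set of groups $\mathcal{G}$, we have

$$\|x\|_{\mathcal{A}} = \Omega^{\mathcal{G}}_{\mathrm{overlap}}(x)$$

where $\Omega^{\mathcal{G}}_{\mathrm{overlap}}(x)$ is the overlapping group lasso norm defined in [11].

Proof In (1), we can substitute $v_G = c_G a_G$, giving us $c_G = |c_G| \cdot \|a_G\| = \|c_G a_G\| = \|v_G\|$. Each $v_G$ is
supported on $G$, since $(a_G)_{G^c} = 0$. Hence,

$$\begin{aligned}
\|x\|_{\mathcal{A}} &= \inf\left\{ \sum_{a \in \mathcal{A}} c_a : x = \sum_{a \in \mathcal{A}} c_a a, \; c_a \ge 0 \;\; \forall a \in \mathcal{A} \right\} \\
&= \inf\left\{ \sum_{G \in \mathcal{G}} \|v_G\| : x = \sum_{G \in \mathcal{G}} v_G \right\} \\
&= \Omega^{\mathcal{G}}_{\mathrm{overlap}}(x) \qquad \blacksquare
\end{aligned}$$

Corollary 2.2 Under the atomic set defined in (3), when $\mathcal{G}$ partitions $\mathbb{R}^p$,

$$\|x\|_{\mathcal{A}} = \sum_{G \in \mathcal{G}} \|x_G\|$$

Proof $\Omega^{\mathcal{G}}_{\mathrm{overlap}}(x) = \sum_{G \in \mathcal{G}} \|x_G\|$ in the non-overlapping case. $\blacksquare$

Thus, (2) yields

$$\hat{x} = \arg\min_{x \in \mathbb{R}^p} \Omega^{\mathcal{G}}_{\mathrm{overlap}}(x) \quad \text{s.t.} \quad y = \Phi x \tag{4}$$

which can be solved using [11].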

Also note that we can directly compute the dual of the atomic norm from the set of atoms:

$$\|u\|_{\mathcal{A}}^* = \sup_{a \in \mathcal{A}} \langle a, u \rangle = \max_{G \in \mathcal{G}} \|u_G\| \tag{5}$$

The dual norm will be useful in our derivations below.
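As a small illustration of the quantities above, the following Python sketch evaluates the atomic norm in the non-overlapping case (Corollary 2.2) and the dual norm in (5). The helper names (`restrict`, `group_norm_disjoint`, `dual_norm`), the zero-based indexing, and the toy vectors are assumptions made for this example and do not come from the original text.

```python
import math

def restrict(x, group):
    """Return x_G: a copy of x with every coordinate outside `group` set to 0."""
    return [xi if i in group else 0.0 for i, xi in enumerate(x)]

def l2(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(xi * xi for xi in x))

def group_norm_disjoint(x, groups):
    """Atomic norm when the groups do not overlap: sum of per-group l2 norms."""
    return sum(l2(restrict(x, g)) for g in groups)

def dual_norm(u, groups):
    """Dual atomic norm: the largest per-group l2 norm (also valid with overlap)."""
    return max(l2(restrict(u, g)) for g in groups)

# Hypothetical toy instance: p = 6 coordinates, three disjoint groups of size 2.
groups = [{0, 1}, {2, 3}, {4, 5}]
x = [3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
u = [1.0, 0.0, 0.0, 2.0, 1.0, 1.0]

norm_x = group_norm_disjoint(x, groups)      # only the first group is active
dual_u = dual_norm(u, groups)                # largest group norm of u
inner = sum(a * b for a, b in zip(x, u))     # <x, u>

# Duality implies <x, u> <= ||x||_A * ||u||_A^*.
print(norm_x, dual_u, inner <= norm_x * dual_u)
```

For overlapping groups the primal norm requires solving the infimum in Lemma 2.1, so this sketch covers only the disjoint case for the primal norm.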
2.2 Gaussian Widths and Exact Recovery 

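Following [4], we define the tangent cone and normal cone at $x^*$ with respect to $\mathrm{conv}(\mathcal{A})$ under $\|x\|_{\mathcal{A}}$ as:

$$T_{\mathcal{A}}(x^*) = \mathrm{cone}\{z - x^* : \|z\|_{\mathcal{A}} \le \|x^*\|_{\mathcal{A}}\} \tag{6}$$
$$\mathcal{N}_{\mathcal{A}}(x^*) = \{u : \langle u, z \rangle \le 0 \;\; \forall z \in T_{\mathcal{A}}(x^*)\} \tag{7}$$
$$= \{u : \langle u, x^* \rangle = t\|x^*\|_{\mathcal{A}} \text{ and } \|u\|_{\mathcal{A}}^* \le t \text{ for some } t \ge 0\}$$

We note that, from [4] (Prop. 2.1), $\hat{x} = x^*$ is the unique solution of (2) iff

$$\mathrm{null}(\Phi) \cap T_{\mathcal{A}}(x^*) = \{0\} \tag{8}$$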



Hence, we require that the tangent cone at $x^*$ intersects the nullspace of $\Phi$ only at the origin, to guarantee
exact recovery.

Before we state the main recovery result from [4], we define the Gaussian width of a set:

Definition Let $S^{p-1}$ denote the unit sphere in $\mathbb{R}^p$. The Gaussian width $\omega(S)$ of a set $S \subset S^{p-1}$ is

$$\omega(S) = \mathbb{E}_g\left[\sup_{z \in S} g^T z\right], \qquad \text{where } g \sim \mathcal{N}(0, I)$$
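To make the definition concrete, the following Python sketch estimates the Gaussian width of a finite subset of the unit sphere by Monte Carlo. The choice of set (the signed canonical basis vectors, i.e. the atoms of the $\ell_1$ norm), the function names, the dimension, and the number of trials are all assumptions made for illustration. For a finite set of $N$ unit vectors, the standard bound $\mathbb{E}[\max] \le \sqrt{2\log N}$ serves as a sanity check.

```python
import math
import random

def gaussian_width(points, trials=500, seed=0):
    """Monte Carlo estimate of E[max_{z in points} <g, z>] with g ~ N(0, I)."""
    rng = random.Random(seed)
    p = len(points[0])
    total = 0.0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(p)]
        total += max(sum(gi * zi for gi, zi in zip(g, z)) for z in points)
    return total / trials

def signed_basis(p):
    """The 2p vectors +e_i and -e_i, all of which lie on the unit sphere."""
    atoms = []
    for i in range(p):
        for sign in (1.0, -1.0):
            e = [0.0] * p
            e[i] = sign
            atoms.append(e)
    return atoms

p = 20
width = gaussian_width(signed_basis(p))
upper = math.sqrt(2.0 * math.log(2 * p))  # bound for a maximum over 2p Gaussians
print(width, upper)
```

The estimate grows only logarithmically in $p$, which is the behavior that makes the width a useful measure of the "size" of a structured set.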



Gordon uses the Gaussian width to provide bounds on the probability that a random subspace of a certain 
dimension misses a subset of the sphere [Hj. In [3], these results are specialized to the case of atomic norm 
recovery. In particular, we will make use of the following: 

Proposition 2.3 (Corollary 3.2) Let $\Phi : \mathbb{R}^p \to \mathbb{R}^n$ be a random map with i.i.d. zero-mean Gaussian
entries having variance $1/n$. Further let $\Omega = T_{\mathcal{A}}(x^*) \cap S^{p-1}$ denote the spherical part of the tangent cone
$T_{\mathcal{A}}(x^*)$. Suppose that we have measurements $y = \Phi x^*$, and we solve the convex program (2). Then $x^*$ is the
unique optimum of (2) with high probability provided that

$$n \ge \omega(\Omega)^2 + O(1).$$

To complete our problem setup we will also restate Proposition 3.6 in [3]:

Proposition 2.4 (Proposition 3.6 in [3]) Let $C$ be any non-empty convex cone in $\mathbb{R}^p$, and let $g \sim \mathcal{N}(0, I)$
be a Gaussian vector. Then:

$$\omega(C \cap S^{p-1}) \le \mathbb{E}_g[\mathrm{dist}(g, C^*)] \tag{9}$$

where $\mathrm{dist}(\cdot, \cdot)$ denotes the Euclidean distance between a point and a set, and $C^*$ is the dual cone of $C$.
We can then square (9) and use Jensen's inequality to obtain

$$\omega(C \cap S^{p-1})^2 \le \mathbb{E}_g[\mathrm{dist}(g, C^*)^2] \tag{10}$$

We note here that the dual cone of the tangent cone is the normal cone, and vice-versa.

Thus, to derive measurement bounds, we only need to calculate the square of the Gaussian width of the
intersection of the tangent cone at $x^*$ with respect to the atomic norm and the unit sphere. This value can
be bounded by the distance of a Gaussian random vector to the normal cone at the same point, as implied
by (10). In the next Section, we derive bounds on this quantity.



3 Gaussian Width of the Normal Cone of the Group Sparsity Norm

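For generic groups $\mathcal{G}$, we have

$$v \in \mathcal{N}_{\mathcal{A}}(x^*) \iff \exists \gamma \ge 0 : \langle v, x^* \rangle = \gamma \|x^*\|_{\mathcal{A}}, \;\; \|v_G\| = \gamma \text{ if } G \in \mathcal{G}^*, \;\; \|v_G\| \le \gamma \text{ if } G \notin \mathcal{G}^* \tag{11}$$

It is not hard to see that, in the case of disjoint groups,

$$\mathcal{N}_{\mathcal{A}}(x^*) = \left\{ z \in \mathbb{R}^p : z_i = \gamma \frac{x^*_i}{\|x^*_G\|} \;\; \forall i \in G, \; \forall G \in \mathcal{G}^*, \;\; \|z_G\| \le \gamma \;\; \forall G \notin \mathcal{G}^*, \;\; \gamma \ge 0 \right\} \tag{12}$$

However, in the case of overlapping groups, we do not know how to obtain such a closed form.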

We now prove the main result of this paper, a sufficient number of Gaussian measurements needed to recover
a group-sparse signal:

Theorem 3.1 To exactly recover a $k$-group sparse signal decomposed into $M$ groups in $\mathbb{R}^p$,
$(\sqrt{2\log(M-k)} + \sqrt{B})^2 k + kB$ i.i.d. Gaussian measurements are sufficient.

To prove this result, we need two lemmas: 

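Lemma 3.2 Let $q_1, \ldots, q_L$ be $L$ $\chi$-squared random variables with $d$ degrees of freedom. Then

$$\mathbb{E}\left[\max_{1 \le i \le L} q_i\right] \le (\sqrt{2\log(L)} + \sqrt{d})^2.$$

Proof Let $M_L := \max_{1 \le i \le L} q_i$. For $t > 0$, we have that

$$\begin{aligned}
\mathbb{E}[M_L] &= \frac{\log[\exp(t \cdot \mathbb{E}[M_L])]}{t} \\
&\overset{(i)}{\le} \frac{\log[\mathbb{E}[\exp(t \cdot M_L)]]}{t} \\
&\overset{(ii)}{=} \frac{\log[\mathbb{E}[\max_{1 \le j \le L} \exp(t \cdot q_j)]]}{t} \\
&\overset{(iii)}{\le} \frac{\log[L \, \mathbb{E}[\exp(t \cdot q_1)]]}{t} \\
&= \frac{\log(L) - \frac{d}{2}\log(1 - 2t)}{t}
\end{aligned}$$

Where (i) follows from Jensen's inequality, (ii) follows from the monotonicity of the exponential function,
and (iii) merely bounds the maximum by the sum over all the elements. Now, setting $t = (2 + 2\epsilon)^{-1}$ with
$\epsilon = \sqrt{\frac{d}{2\log(L)}}$ yields $\mathbb{E}[M_L] \le (\sqrt{2\log(L)} + \sqrt{d})^2$. $\blacksquare$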

Note that t can be optimized depending on the application. We use this particular choice because it makes 
no assumptions about the relative magnitudes of {M — k) and B. 
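A quick numerical sanity check of Lemma 3.2 can be sketched as follows. The sketch draws independent $\chi^2_d$ variables (independence is an assumption of the simulation, not of the lemma) using the fact that a $\chi^2_d$ variable is Gamma distributed with shape $d/2$ and scale $2$. The function names, the values $L = 95$ and $d = 20$, and the trial count are illustrative choices only.

```python
import math
import random

def chi2_max_bound(L, d):
    """Upper bound on E[max of L chi-squared(d) variables] from Lemma 3.2."""
    return (math.sqrt(2.0 * math.log(L)) + math.sqrt(d)) ** 2

def mean_max_chi2(L, d, trials=2000, seed=0):
    """Monte Carlo estimate of E[max_i q_i] for independent q_i ~ chi-squared(d)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # chi-squared with d degrees of freedom equals Gamma(shape=d/2, scale=2)
        total += max(rng.gammavariate(d / 2.0, 2.0) for _ in range(L))
    return total / trials

L, d = 95, 20
estimate = mean_max_chi2(L, d)
bound = chi2_max_bound(L, d)
print(estimate, bound)
```

The empirical mean should sit above $d$ (the mean of a single variable) and below the bound, with the gap reflecting the slack introduced by the union-style step (iii).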

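Lemma 3.3 Suppose $v \in \mathbb{R}^p$ is supported on some set of groups $\mathcal{G}^* \subset \mathcal{G}$. Then

$$\|v\| \le \sqrt{|\mathcal{G}^*|} \, \|v\|_{\mathcal{A}}^*.$$

Proof By duality, it suffices to show that $\|z\|_{\mathcal{A}} \le \sqrt{|\mathcal{G}^*|} \, \|z\|$ for all $z$. For any $z$, there exists a representation
$z = \sum_{G \in \mathcal{G}^*} b_G$ where none of the supports of the $b_G$ overlap. It then follows that

$$\|z\|_{\mathcal{A}} \overset{(i)}{\le} \sum_{G \in \mathcal{G}^*} \|b_G\| \overset{(ii)}{\le} \sqrt{|\mathcal{G}^*|} \left( \sum_{G \in \mathcal{G}^*} \|b_G\|^2 \right)^{1/2} = \sqrt{|\mathcal{G}^*|} \, \|z\|$$

Where (i) follows from the definition of the norm $\|\cdot\|_{\mathcal{A}}$ and (ii) is a consequence of the relation $\|\beta\|_1 \le \sqrt{k} \, \|\beta\|_2$
for $k$ dimensional vectors $\beta$. $\blacksquare$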
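The inequality of Lemma 3.3 can be checked numerically on a small overlapping example. In the sketch below, the group layout (consecutive groups sharing half of their indices), the choice of active groups, and all names are assumptions made for illustration; the dual norm is computed as the largest per-group $\ell_2$ norm, as in (5).

```python
import math
import random

def l2_on(v, group):
    """Euclidean norm of v restricted to the indices in `group`."""
    return math.sqrt(sum(v[i] ** 2 for i in group))

def lemma_sides(v, groups, num_active):
    """Return (||v||, sqrt(|G*|) * ||v||_A^*) for comparison."""
    dual = max(l2_on(v, g) for g in groups)
    full = math.sqrt(sum(vi * vi for vi in v))
    return full, math.sqrt(num_active) * dual

rng = random.Random(0)
p, B, M = 60, 10, 11
# Hypothetical layout: group j covers indices 5j .. 5j+9, so neighbours overlap.
groups = [list(range(5 * j, 5 * j + B)) for j in range(M)]
active = [groups[1], groups[2], groups[7]]
support = set().union(*active)

# A random vector supported on the union of the active groups.
v = [rng.gauss(0.0, 1.0) if i in support else 0.0 for i in range(p)]
lhs, rhs = lemma_sides(v, groups, len(active))
print(lhs, rhs, lhs <= rhs)
```

Note that the comparison holds regardless of how much the active groups overlap, which mirrors the role the lemma plays in the proof of Theorem 3.1.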




Proof of Theorem 3.1 Intuition: Note that, from (10), we need to bound the distance between $\mathcal{N}_{\mathcal{A}}(x^*)$
and a random Gaussian vector. In the following proof, we carefully construct a specific vector $r \in \mathcal{N}_{\mathcal{A}}(x^*)$
and bound the distance from $r$ to the Gaussian vector. Naturally, this will be an upper bound on the desired
distance.

Now, let $S = \bigcup_{G \in \mathcal{G}^*} G$, i.e. $S$ is the set of indices corresponding to the union of groups that support $x^*$. Note that
$S \subset \{1, 2, \ldots, p\}$. Since the normal cone is nonempty, let $v \in \mathcal{N}_{\mathcal{A}}(x^*)$ and let

$$\|v\|_{\mathcal{A}}^* = 1 \tag{13}$$

then we must have that $\langle v, x^* \rangle = \|x^*\|_{\mathcal{A}}$. Moreover, for each $G$ that intersects $S$, $\|v_G\| = 1$. This follows
from the definition in (11). Also, let $v_{S^c} = 0$. It can be verified that $v$ satisfies all the properties in (11).

Let $w \sim \mathcal{N}(0, I_p)$ be a vector with i.i.d. Gaussian entries. We can write $w = [w_S \; w_{S^c}]^T$. Let
$t(w) = \max_{G \notin \mathcal{G}^*} \|w_G\|$.

Let us now construct a vector $r \in \mathcal{N}_{\mathcal{A}}(x^*)$. We can decompose $r$ as $r = [r_S \; r_{S^c}]^T$. Let $r_S = t(w) \cdot v_S$, and
$r_{S^c} = w_{S^c}$.

From (11), and from our definition of $t(w)$, we have $r \in \mathcal{N}_{\mathcal{A}}(x^*)$. Referring to (10), we now consider the
expected squared distance between $\mathcal{N}_{\mathcal{A}}(x^*)$ and $w$:

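$$\begin{aligned}
\mathbb{E}[\mathrm{dist}(w, C^*)^2] &\le \mathbb{E}[\|r - w\|^2] \\
&= \mathbb{E}[\|r_S - w_S\|^2 + \|r_{S^c} - w_{S^c}\|^2] \\
&\overset{(i)}{=} \mathbb{E}[\|r_S - w_S\|^2] \\
&\overset{(ii)}{=} \mathbb{E}[\|r_S\|^2] + \mathbb{E}[\|w_S\|^2] \\
&= \mathbb{E}[\|t(w) \cdot v_S\|^2] + \mathbb{E}[\|w_S\|^2] \\
&\overset{(iii)}{=} \mathbb{E}[t(w)^2] \cdot \|v_S\|^2 + \mathbb{E}[\|w_S\|^2] \\
&\overset{(iv)}{=} \mathbb{E}[t(w)^2] \cdot \|v_S\|^2 + |S| \\
&\overset{(v)}{\le} (\sqrt{2\log(M-k)} + \sqrt{B})^2 \cdot \|v_S\|^2 + kB \\
&\overset{(vi)}{\le} (\sqrt{2\log(M-k)} + \sqrt{B})^2 \cdot k + kB
\end{aligned}$$

Where (i) follows because $S$ and $S^c$ are disjoint, (ii) follows from the fact that $r_S$ and $w_S$ are independent,
(iii) follows from the fact that $v$ is deterministic. We obtain (iv) since $\|w_S\|^2$ is a $\chi^2$ random variable with
$|S|$ degrees of freedom, (v) follows from Lemma 3.2 and from the fact that $kB$ is an upper bound on the
signal sparsity. Finally, (vi) follows from Lemma 3.3, noting that $|\mathcal{G}^*| \le k$ and $\|v\|_{\mathcal{A}}^* = 1$ from (13). $\blacksquare$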



If the groups are disjoint to begin with, then the normal cone will be given by (12), and $\|v_S\| = \sqrt{k}$. Also,
in this case, we have $|S| = kB$. We see that we do not pay an additional penalty when the groups overlap,
except that the bound for the signal sparsity becomes loose. This fact is surprising, since one would expect
that one would need more measurements to effectively capture the dependencies among the overlapping
groups.
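The construction in the proof can be simulated directly in the disjoint case, where the normal cone has the closed form (12). The sketch below builds $v$ from a random signal, draws $w$, forms $r$ with $r_S = t(w) v_S$ and $r_{S^c} = w_{S^c}$, and averages $\|r - w\|^2$. The parameter values, the uniform signal model, and the function names are assumptions chosen for this illustration; this checks only the upper bound on the distance, not recovery itself.

```python
import math
import random

def theorem_bound(M, B, k):
    """Number of measurements given by Theorem 3.1."""
    return (math.sqrt(2.0 * math.log(M - k)) + math.sqrt(B)) ** 2 * k + k * B

def mean_sq_distance(M, B, k, trials=500, seed=0):
    """Average ||r - w||^2 for the vector r built in the proof (disjoint groups)."""
    rng = random.Random(seed)
    p = M * B
    groups = [range(j * B, (j + 1) * B) for j in range(M)]
    active, inactive = groups[:k], groups[k:]
    # Arbitrary signal supported on the first k groups (an assumption).
    x_star = [rng.uniform(0.1, 1.0) if i < k * B else 0.0 for i in range(p)]
    v = [0.0] * p
    for g in active:
        norm = math.sqrt(sum(x_star[i] ** 2 for i in g))
        for i in g:
            v[i] = x_star[i] / norm  # unit norm on each active group
    total = 0.0
    for _ in range(trials):
        w = [rng.gauss(0.0, 1.0) for _ in range(p)]
        t = max(math.sqrt(sum(w[i] ** 2 for i in g)) for g in inactive)
        # r equals w outside S, so only the coordinates in S contribute.
        total += sum((t * v[i] - w[i]) ** 2 for g in active for i in g)
    return total / trials

M, B, k = 30, 5, 3
estimate = mean_sq_distance(M, B, k)
bound = theorem_bound(M, B, k)
print(estimate, bound)
```

The simulated average stays below the bound of Theorem 3.1 and above $kB$, consistent with steps (iv) through (vi) of the derivation.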



3.1 Remarks 

The kB term in the bound is an upper-bound on the signal sparsity. In the case of highly overlapping 
groups, this value may be much larger than the signal sparsity. This is an unfortunate artifact of the general 
approach we take to derive a bound on the number of measurements. If the specific structure of groups is 
known (trees, hierarchies, etc.), one can refine the bound accordingly. Of course, the bound will be tightest 
when there is a block-sparse structure, i.e. there is no overlap between groups. 

It can be seen from Theorem 3.1 that the number of measurements is linear in $k$ and $B$. Hence, the number
of measurements that are sufficient for signal recovery grows linearly with the number of active groups in 
the signal, and also the maximum group size. 
We note that although we pay no extra price to measure the signal when the groups overlap, there is
an additional cost in the recovery process: either the groups must first be separated by
replicating the coefficients [11], or one must resort to a primal-dual method to solve the problem [16].



3.2 Noisy Observations 

The results we obtain can be easily extended to the case where we obtain noisy observations, assuming that
the noise is bounded. In the noisy case, we observe

$$y = \Phi x^* + \theta, \qquad \|\theta\| \le \delta$$

We then solve the atomic norm minimization problem, with a relaxed constraint to take into account the
bounded noise:

$$\hat{x} = \arg\min \|x\|_{\mathcal{A}} \quad \text{s.t.} \quad \|y - \Phi x\| \le \delta \tag{14}$$

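We restate Corollary 3.3 from [4]:

Proposition 3.4 ([4], Corollary 3.3) Let $\Phi : \mathbb{R}^p \to \mathbb{R}^n$ be a random map with i.i.d. zero-mean Gaussian
entries having variance $1/n$. Further let $\Omega = T_{\mathcal{A}}(x^*) \cap S^{p-1}$ denote the spherical part of the tangent cone
$T_{\mathcal{A}}(x^*)$. Suppose that we have measurements $y = \Phi x^* + \theta$, and $\|\theta\| \le \delta$. Suppose we solve the convex program
(14). Let $\hat{x}$ denote the optimum of (14). Also, suppose $\|\Phi z\| \ge \epsilon \|z\|$ $\forall z \in T_{\mathcal{A}}(x^*)$. Then
$\|x^* - \hat{x}\| \le \frac{2\delta}{\epsilon}$ with high probability, provided that $n$ is sufficiently large; the explicit condition on $n$ is
stated in [4], Corollary 3.3.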



Substituting the result in Theorem 3.1 in Proposition 3.4, we have the following corollary yielding a sufficient
condition to accurately recover a signal when the measurements are corrupted with bounded noise:

Corollary 3.5 Suppose we wish to recover a $k$-group sparse signal having $M$ groups, such that the
maximum group size is $B$. Let $\hat{x}$ be the optimum of the convex program (14). To have $\|\hat{x} - x^*\| \le \frac{2\delta}{\epsilon}$ with high
probability,

$$(\sqrt{2\log(M-k)} + \sqrt{B})^2 k + kB$$

i.i.d. Gaussian measurements are sufficient.



4 Experiments and Results 

We extensively tested our method against the standard lasso procedure. In the case where the groups overlap,
we use the replication method outlined in [11] to reduce the problem to that of non-overlapping groups.
We compare the number of measurements needed for our method with that needed for the lasso. For the
lasso, we use the bound derived in [3], viz. $(2s + 1)\log(p - s)$. We generate length $p = 2000$ signals, made
up of $M = 100$ non-overlapping groups of size $B = 20$. We set $k = 5$ groups to be "active", and the values
within the groups are drawn from a uniform $[0, 1]$ distribution. The active groups are assigned uniformly at
random. The sparsity of the signal will be $s = 100$.

We use SpaRSA [23] for the lasso and the group lasso with overlap, learning $\lambda$ over a grid. Figure 1 displays
the mean reconstruction error $\|\hat{x} - x^*\|/p$ as a function of the number of random measurements taken. The
errors have been averaged over 100 tests, and each time a new random signal was generated with the above
mentioned parameters.

From the parameters considered, we conclude that $\approx 380$ measurements are sufficient to recover the signal.
Note that, when we have 380 measurements, the lasso does not recover the signal exactly, as seen in Figure 1.



Figure 1: Comparison with the lasso. The vertical line indicates our bound. Note that our bound (380) 
predicts exact recovery of the signal, while at the same value, the lasso does not recover the signal 
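The two bounds compared in this experiment can be evaluated with a few lines of Python. The function names `group_bound` and `lasso_bound` are assumptions introduced here; the formulas are the bound of Theorem 3.1 and the lasso bound $(2s + 1)\log(p - s)$ quoted above, evaluated at the stated parameters $p = 2000$, $M = 100$, $B = 20$, $k = 5$, $s = 100$.

```python
import math

def group_bound(M, B, k):
    """Measurements sufficient for group-sparse recovery (Theorem 3.1)."""
    return (math.sqrt(2.0 * math.log(M - k)) + math.sqrt(B)) ** 2 * k + k * B

def lasso_bound(p, s):
    """Lasso measurement bound used for comparison: (2s + 1) log(p - s)."""
    return (2 * s + 1) * math.log(p - s)

ours = group_bound(M=100, B=20, k=5)   # roughly 380 measurements
lasso = lasso_bound(p=2000, s=100)     # roughly 1517 measurements
print(round(ours, 1), round(lasso, 1))
```

With these parameters the group bound is about a quarter of the lasso bound, which matches the qualitative gap visible in Figure 1.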



To show that the bound we compute holds regardless of the complexity of groupings, we consider the following
scenario: suppose we have $M = 100$ groups, each of size $B = 40$; $k = 5$ of those groups are active, and the
values within each group are assigned from a uniform $[-1, 1]$ distribution. We arrange these groups in the
following configurations:

(i) The groups do not overlap, yielding a signal of length $p = 4000$, and signal sparsity $s = 200$.

(ii) A partial overlapping scenario, where apart from the first and last group, every group has 20 elements in
common with the group above it, and 20 in common with the group below, giving $p = 2020$, $s \in [120, 200]$
depending on which of the 100 groups are active.

(iii) An almost complete overlap, where apart from one element in each group, the remaining elements are
common to each group. This leads to $p = 139$ and $s = 44$.

(iv) We also considered cases intermediate to the ones listed above. Specifically, we considered (a) a highly
overlapping scenario which is identical to the previous case, but with odd and even groups disjoint.
We also consider (b) a random overlap case where the first 50 groups are non-overlapping and the
remaining 50 are assigned uniformly at random from the existing $p = 2000$ indices.
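The signal lengths and sparsity levels quoted for configurations (i) to (iii) can be reproduced by constructing the groups explicitly. The sketch below is one possible index layout consistent with the descriptions above; the exact layout used in the experiments is not given in the text, so the constructions, the variable names, and the random choice of active groups are assumptions.

```python
import random

def union_size(groups):
    """Number of distinct indices covered by a collection of groups."""
    return len(set().union(*groups))

M, B, k = 100, 40, 5

# (i) disjoint groups of 40 consecutive indices
no_overlap = [set(range(B * j, B * (j + 1))) for j in range(M)]
# (ii) each group shares 20 indices with each neighbour
partial = [set(range(20 * j, 20 * j + B)) for j in range(M)]
# (iii) 39 indices common to all groups plus one private index per group
shared = set(range(B - 1))
almost_full = [shared | {B - 1 + j} for j in range(M)]

rng = random.Random(0)
active = rng.sample(range(M), k)  # hypothetical choice of active groups

results = {}
for name, groups in [("i", no_overlap), ("ii", partial), ("iii", almost_full)]:
    p = union_size(groups)                       # signal length
    s = union_size([groups[j] for j in active])  # size of the support union
    results[name] = (p, s)
print(results)
```

In case (ii) the value of $s$ depends on how many of the active groups are adjacent, which is why the text reports a range rather than a single number.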



The scenarios we consider are depicted in Figure 2(a). In each of the cases, we compute the bound to be
$\approx 630$. We can see from Figure 2(b) that the bound holds for all cases. The bound becomes looser as the
complexity of the groupings increases. This, as argued before, is a result of the bound for the signal sparsity
becoming looser. From the values of $p$ and $s$ computed for the three cases, we have the corresponding bounds
for the lasso to be 3305 for the no overlap case (i), [1819, 3010] for the partial overlap case (ii) and 405 for
the almost complete overlap case (iii), respectively. The lasso bounds for case (iv) will lie between those for
cases (ii) and (iii).



Finally, we consider the wavelet transform coefficients of the "blocks" signal (Figure 4(a)). It was shown in
[H] that the coefficients can be grouped, to account for parent-child dependencies across scales of the wavelet
transform, as shown in Figure 3. In this case, for a $p = 16384$ length signal, we have $M = 16382$ groups,
and the maximum group size is $B = 2$. We use the Haar wavelet basis to decompose the signal. Figure
4(b) shows the reconstruction obtained from 1690 measurements, corresponding to the bound computed for
$k = 47$. We see that our bound yields a sufficient number of measurements for exact recovery.

Figure 2: (Best seen in color) Performance on various grouping schemes. (a) Types of groupings considered;
each set of coefficients encompassed by one color belongs to one group. (b) Performance on the cases
considered in (a). Note that our bound evaluates to 630, clearly sufficient measurements to recover the
signal. The corresponding bounds for the lasso (for cases (i), (ii) and (iii)) are 3305, [1819, 3010]
(depending on $s$) and 405 respectively. We can see that, as the amount of overlap increases, our bound
loosens, and for pathological cases the lasso bound is tighter.




Figure 3: Groups on the 1D wavelet transform



5 Conclusion 

We showed that, when additional structure about the support of the signal to be estimated is known, we
can recover the signal with far fewer measurements than would be needed in the standard compressed
sensing framework. Also, we showed that, surprisingly, we do not pay an extra penalty when the groups
overlap each other. Moreover, the bound holds for arbitrary group structures, and can be used in a variety
of applications. The bounds we derive are tight, and can be extended to subgaussian measurement matrices
by incurring a constant penalty. Experimental results on both toy and real data agree with the bounds we
obtained.
Figure 4: Exact reconstruction of a length 16384 signal from 1690 measurements in the wavelet domain